Overwatch heroes Lucio, Wrecking Ball and Soldier 76 in action
Our dataset consists of information on competitive gamers who play the video game Overwatch on PlayStation 4. Overwatch is a team-based multiplayer first-person shooter developed and published by Blizzard Entertainment. Each match places players on two teams of six, and each player selects from a roster of 30 characters, known as “heroes”. Every hero has a unique style of play, and heroes are grouped into three general role categories (damage, tank, and support). Players on a team work together to secure and defend control points on a map, or to escort a payload across the map within a limited amount of time.
Overwatch has a large online community and esports presence. A player’s skill in competitive games is summarized by a “skill rating”, or “SR” for short, which Blizzard calculates with an undisclosed formula. SR ranges from 0 to 5,000; the higher the rating, the better the player.
Within the community, SR is divided into tiers according to how high the rating is, ranging from Bronze to Grandmaster.
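As a rough illustration of the tier system, SR can be binned with base R's `cut()`. The boundaries below are an assumption taken from the commonly cited competitive brackets (Bronze below 1,500, then 500-point steps up to Grandmaster at 4,000 and above); they are not part of our dataset, and `sr_to_tier` is a hypothetical helper name.

```r
# Tier boundaries are an assumption based on the commonly cited
# Overwatch competitive brackets; they do not come from the dataset.
sr_breaks <- c(0, 1500, 2000, 2500, 3000, 3500, 4000, 5000)
sr_labels <- c("Bronze", "Silver", "Gold", "Platinum",
               "Diamond", "Master", "Grandmaster")

# Map numeric SR to an ordered tier factor.
# right = FALSE closes each interval on the left, so [1500, 2000) is Silver.
# include.lowest = TRUE lets the top interval include exactly 5000.
sr_to_tier <- function(sr) {
  cut(sr, breaks = sr_breaks, labels = sr_labels,
      right = FALSE, include.lowest = TRUE, ordered_result = TRUE)
}

# A few skill ratings taken from the first rows of the data
tiers <- sr_to_tier(c(2535, 3815, 2909, 1735))
print(tiers)
```

Because the result is an ordered factor, tiers can be compared directly (for example `tiers >= "Diamond"`), which may be handy for summaries by tier.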
We scraped a snapshot of SR (which changes from game to game) for PS4 players whose profiles were public on overwatchtracker.com. We then scraped those players’ career statistics from ovrstat.com, an open-source API that returns conveniently formatted JSON.
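The dotted column names in our data (for example `assists.defensiveAssists`) are what one gets when nested JSON is flattened. The sketch below shows that idea with base R only. The nested list is a hypothetical stand-in for one parsed API response, and its exact shape is an assumption; the values are those of the first player in our data.

```r
# Hypothetical fragment of one player's career stats, shaped like parsed JSON.
# The field names mirror columns in the dataset; the nesting is an assumption.
player <- list(
  assists = list(defensiveAssists = 208, healingDone = 113872),
  combat  = list(deaths = 961, eliminations = 2894),
  game    = list(gamesWon = 56, gamesLost = 55)
)

# unlist() flattens nested lists and joins the names with ".",
# giving names such as "assists.defensiveAssists".
flat <- unlist(player)

# Turn the named vector into a one-row data frame (one row per player).
row <- as.data.frame(as.list(flat))
print(names(row))
```

Rows built this way for each player could then be stacked with `rbind()` to form a table with one row per player.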
We currently have over two thousand player skill ratings and over two thousand predictor variables. However, our research question lets us narrow the analysis to a small subset of those predictors, around 60.
Overall, we’re interested in the question: if a player wants to improve their SR, what should they focus on? Should they try to eliminate more opponents? Heal their teammates? Or play a certain character? We will address questions like these with a predictive model of SR that uses career player statistics as predictors. The answers should help players improve their SR more efficiently and begin climbing their way to Grandmaster!
The answers we find could be used by amateur and professional Overwatch players alike. We think of our analysis as the start of something like “Moneyball” for Overwatch.
library(tidyverse)  # provides read_csv(), %>%, glimpse() and ggplot()
df <- read_csv('data/clean-data.csv')
relevel top hero - brian
glimpse - kai
df %>% glimpse()
## Observations: 2,316
## Variables: 63
## $ skill_rating <dbl> 2535, 3815, 2909, 1735, 2…
## $ assists.defensiveAssists <dbl> 208, 26, 5, 557, 412, 0, …
## $ assists.healingDone <dbl> 113872, 461978, 16315, 33…
## $ assists.offensiveAssists <dbl> 16, 683, 30, 303, 448, 8,…
## $ average.allDamageDoneAvgPer10Min <dbl> 10696, 11647, 10539, 7440…
## $ average.barrierDamageDoneAvgPer10Min <dbl> 2132, 4929, 2498, 2264, 3…
## $ average.deathsAvgPer10Min <dbl> 7.41, 6.47, 8.67, 6.89, 8…
## $ average.eliminationsAvgPer10Min <dbl> 22.31, 20.30, 20.87, 18.1…
## $ average.finalBlowsAvgPer10Min <dbl> 12.76, 11.25, 13.11, 6.18…
## $ average.healingDoneAvgPer10Min <dbl> 878.00, 4094.00, 500.00, …
## $ average.heroDamageDoneAvgPer10Min <dbl> 8223, 6490, 7773, 4977, 6…
## $ average.objectiveKillsAvgPer10Min <dbl> 9.18, 8.15, 6.62, 8.69, 8…
## $ average.objectiveTimeAvgPer10Min <dbl> 52, 78, 39, 76, 72, 45, 8…
## $ average.soloKillsAvgPer10Min <dbl> 3.53, 2.14, 3.34, 1.10, 2…
## $ average.timeSpentOnFireAvgPer10Min <dbl> 76, 76, 68, 69, 55, 154, …
## $ best.allDamageDoneMostInGame <dbl> 27906, 29403, 19787, 1565…
## $ best.barrierDamageDoneMostInGame <dbl> 12549, 14998, 6376, 7599,…
## $ best.defensiveAssistsMostInGame <dbl> 10, 20, 2, 29, 36, 0, 35,…
## $ best.eliminationsMostInGame <dbl> 60, 56, 40, 40, 52, 24, 6…
## $ best.environmentalKillsMostInGame <dbl> 1, 5, 2, 4, 2, 0, 2, 1, 2…
## $ best.finalBlowsMostInGame <dbl> 36, 30, 30, 17, 35, 12, 3…
## $ best.healingDoneMostInGame <dbl> 3354, 9320, 3072, 13869, …
## $ best.heroDamageDoneMostInGame <dbl> 20399, 17370, 14664, 1038…
## $ best.killsStreakBest <dbl> 60, 56, 40, 40, 52, 24, 6…
## $ best.meleeFinalBlowsMostInGame <dbl> 4, 5, 3, 1, 1, 3, 7, 5, 2…
## $ best.multikillsBest <dbl> 4, 3, 4, 3, 5, 3, 4, 4, 3…
## $ best.objectiveKillsMostInGame <dbl> 30, 35, 18, 23, 29, 9, 31…
## $ best.objectiveTimeMostInGame <dbl> 280, 377, 169, 257, 369, …
## $ best.offensiveAssistsMostInGame <dbl> 5, 46, 13, 27, 21, 8, 20,…
## $ best.soloKillsMostInGame <dbl> 36, 30, 30, 17, 35, 12, 3…
## $ best.teleporterPadsDestroyedMostInGame <dbl> 1, 1, 0, 0, 3, 0, 1, 1, 3…
## $ best.timeSpentOnFireMostInGame <dbl> 482, 353, 305, 307, 518, …
## $ best.turretsDestroyedMostInGame <dbl> 22, 11, 3, 11, 9, 0, 4, 1…
## $ combat.barrierDamageDone <dbl> 276462, 556147, 81535, 12…
## $ combat.damageDone <dbl> 1066469, 732220, 253681, …
## $ combat.deaths <dbl> 961, 730, 283, 368, 1064,…
## $ combat.eliminations <dbl> 2894, 2291, 681, 967, 228…
## $ combat.environmentalKills <dbl> 1, 32, 3, 9, 20, 0, 5, 2,…
## $ combat.finalBlows <dbl> 1655, 1269, 428, 330, 102…
## $ combat.heroDamageDone <dbl> 1066469, 732220, 253681, …
## $ combat.meleeFinalBlows <dbl> 33, 153, 4, 4, 2, 3, 29, …
## $ combat.multikills <dbl> 24, 11, 6, 4, 30, 1, 4, 1…
## $ combat.objectiveKills <dbl> 1190, 920, 216, 464, 1040…
## $ combat.objectiveTime <dbl> 6692, 8793, 1263, 4066, 9…
## $ combat.soloKills <dbl> 458, 241, 109, 59, 266, 3…
## $ combat.timeSpentOnFire <dbl> 9869, 8592, 2217, 3660, 6…
## $ game.gamesLost <dbl> 55, 59, 14, 21, 56, 0, 14…
## $ game.gamesTied <dbl> 3, 2, 1, 0, 3, 0, 0, 2, 0…
## $ game.gamesWon <dbl> 56, 41, 13, 29, 53, 1, 20…
## $ game.timePlayed <dbl> 77817, 67698, 19581, 3205…
## $ matchAwards.cards <dbl> 58, 26, 7, 27, 29, 1, 11,…
## $ matchAwards.medals <dbl> 354, 395, 68, 143, 269, 5…
## $ matchAwards.medalsBronze <dbl> 97, 143, 20, 41, 95, 2, 4…
## $ matchAwards.medalsGold <dbl> 169, 112, 33, 54, 83, 2, …
## $ matchAwards.medalsSilver <dbl> 88, 140, 15, 48, 91, 1, 4…
## $ miscellaneous.teleporterPadsDestroyed <dbl> 5, 4, 0, 0, 9, 0, 1, 1, 3…
## $ miscellaneous.turretsDestroyed <dbl> 134, 59, 16, 67, 74, 0, 2…
## $ assists.reconAssists <dbl> 2, 0, 1, 0, 4, 0, 0, 2, 0…
## $ best.reconAssistsMostInGame <dbl> 2, 0, 1, 0, 4, 0, 0, 2, 0…
## $ top_hero <fct> soldier76, roadhog, mccre…
## $ games_played <dbl> 114, 61, 79, 104, 193, 67…
## $ rank <dbl> 26698, 1062, 13572, 46831…
## $ top_hero_type <chr> "damage", "tank", "damage…
summary - kai
bar chart of top players - brian
# Count players per hero, keeping the hero's role for the fill colour
df %>%
  group_by(top_hero_type, top_hero) %>%
  summarise(n = n()) %>%
  ggplot(aes(x = reorder(top_hero, n), y = n, fill = top_hero_type)) +
  geom_col() +
  coord_flip() +
  theme_minimal() +
  labs(x = 'Hero', y = 'Number of players', title = 'Most popular heroes to play') +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9"))
full additive model without redundant variables - brian
fail shapiro test, show qq plot - brian
pairs plots response ~ 5 predictors - kai
correlation - brian
log(gp) - brian
step bic + aic - kai
show r^2 - kai
2 anova tests - kai
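The planned steps above (additive model, normality check, log of games played, stepwise AIC/BIC, R²) could look roughly like the following sketch. It runs on simulated data so that it is self-contained; the column names and every coefficient in the simulation are assumptions for illustration, not results from our data.

```r
set.seed(42)

# Synthetic stand-in for the real data: all values below are simulated,
# so the fitted numbers carry no meaning about Overwatch.
n <- 300
sim <- data.frame(
  elims_per10  = rnorm(n, mean = 20, sd = 4),
  deaths_per10 = rnorm(n, mean = 7, sd = 1.5),
  games_played = rlnorm(n, meanlog = 4.5, sdlog = 0.6),
  noise_stat   = rnorm(n)
)
sim$skill_rating <- 2000 + 40 * sim$elims_per10 - 60 * sim$deaths_per10 +
  80 * log(sim$games_played) + rnorm(n, sd = 250)

# Full additive model, with games played on the log scale
full <- lm(skill_rating ~ elims_per10 + deaths_per10 + log(games_played) +
             noise_stat, data = sim)

# Normality of residuals: Shapiro-Wilk test plus a Q-Q plot
sw <- shapiro.test(resid(full))
qqnorm(resid(full))
qqline(resid(full))

# Backward stepwise selection: the default k = 2 is AIC, k = log(n) is BIC
fit_aic <- step(full, direction = "backward", trace = 0)
fit_bic <- step(full, direction = "backward", k = log(n), trace = 0)

# Compare adjusted R^2 of the full and selected models
r2 <- c(full = summary(full)$adj.r.squared,
        aic  = summary(fit_aic)$adj.r.squared,
        bic  = summary(fit_bic)$adj.r.squared)
print(round(r2, 3))
```

BIC penalizes each extra coefficient by `log(n)` rather than 2, so it tends to select a smaller model than AIC, which suits the goal of giving players a short list of things to focus on.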
library(readr)
clean_df <- read_csv("data/clean-data.csv")
## Parsed with column specification:
## cols(
## .default = col_double(),
## top_hero = col_character(),
## top_hero_type = col_character()
## )
## See spec(...) for full column specifications.
# Column 1 is the response (skill_rating), so drop it from the predictor list
predictors <- setdiff(colnames(clean_df)[1:60], "skill_rating")
predictors <- append(predictors, "games_played")
# Scatterplot of skill rating against each candidate predictor
for (name in predictors) {
  plot(as.formula(paste("skill_rating ~", name)), data = clean_df)
}
diagnostics of final model - brian
2 anova tests - kai
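An ANOVA comparison of nested models, as planned above, might be sketched like this. The data are simulated and the variable names `x1`, `x2`, `x3` are placeholders, so the output only illustrates the mechanics of the F-test.

```r
set.seed(1)

# Simulated data: y depends on x1 and x2 but not on x3 (by construction)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 2500 + 300 * sim$x1 + 150 * sim$x2 + rnorm(n, sd = 200)

# Nested models: the reduced model is a special case of the larger one
reduced <- lm(y ~ x1, data = sim)
larger  <- lm(y ~ x1 + x2 + x3, data = sim)

# F-test of H0: the coefficients of x2 and x3 are both zero
cmp <- anova(reduced, larger)
print(cmp)

# p-value for the comparison (second row of the ANOVA table)
p_value <- cmp[["Pr(>F)"]][2]
```

A small p-value indicates that the extra predictors explain more variation than chance would suggest, so the larger model is preferred; otherwise the reduced model is kept.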
So, what should a player who wants to improve their skill rating focus on?